Results 1 - 20 of 95
1.
Sci Rep ; 14(1): 6749, 2024 03 21.
Article in English | MEDLINE | ID: mdl-38514716

ABSTRACT

The corneal epithelium acts as a barrier to pathogens entering the eye; corneal epithelial cells are continuously renewed by uni-potent, quiescent limbal stem cells (LSCs) located at the limbus, where the cornea transitions to conjunctiva. There has yet to be a consensus on LSC markers and their transcriptome profile is not fully understood, which may be due to using cadaveric tissue without an intact stem cell niche for transcriptomics. In this study, we addressed this problem by using single nuclei RNA sequencing (snRNAseq) on healthy human limbal tissue that was immediately snap-frozen after excision from patients undergoing cataract surgery. We identified the quiescent LSCs as a sub-population of corneal epithelial cells with a low level of total transcript counts. Moreover, TP63, KRT15, CXCL14, and ITGβ4 were found to be highly expressed in LSCs and transiently amplifying cells (TACs), which constitute the corneal epithelial progenitor populations at the limbus. The surface markers SLC6A6 and ITGβ4 could be used to enrich human corneal epithelial cell progenitors, which were also found to specifically express the putative limbal progenitor cell markers MMP10 and AC093496.1.


Subject(s)
Epithelium, Corneal; Limbus Corneae; Humans; Stem Cell Niche; Limbal Stem Cells; Cornea; Epithelium, Corneal/metabolism; Gene Expression Profiling
2.
J Mol Biol ; 435(14): 168093, 2023 07 15.
Article in English | MEDLINE | ID: mdl-37061086

ABSTRACT

Protein structural domains have been less studied than full-length proteins in terms of ontology annotations. The dcGO database has filled this gap by providing mappings from protein domains to ontologies. The dcGO update in 2023 extends annotations for protein domains of multiple definitions (SCOP, Pfam, and InterPro) with commonly used ontologies that are categorised into functions, phenotypes, diseases, drugs, pathways, regulators, and hallmarks. This update adds new dimensions to the utility of both ontology and protein domain resources. A newly designed website at http://www.protdomainonto.pro/dcGO offers a more centralised and user-friendly way to access the dcGO database, with enhanced faceted search returning term- and domain-specific information pages. Users can navigate both ontology terms and annotated domains through improved ontology hierarchy browsing. A newly added facility enables domain-based ontology enrichment analysis.


Subject(s)
Databases, Protein; Protein Domains; Molecular Sequence Annotation; Phenotype
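
Note on the dcGO entry above: the abstract mentions domain-based ontology enrichment analysis but does not say which statistical test is used. As an illustration only, the following minimal sketch assumes a one-sided hypergeometric test, which is a common choice for enrichment analysis and is not confirmed here as the dcGO method. The function name and its parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, hits: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= hits).

    Assumed setting (not taken from the dcGO paper):
      population - number of domains in the background set
      annotated  - background domains annotated with the ontology term
      sample     - number of domains in the user's query set
      hits       - query domains annotated with the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 1000 background domains, 40 carry the term,
    # the query has 25 domains and 6 of them carry the term.
    print(hypergeom_enrichment_p(1000, 40, 25, 6))
```

A small p-value suggests the term is over-represented among the query domains. In practice the p-values for all tested terms would also need a multiple-testing correction.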
3.
Nat Commun ; 14(1): 919, 2023 02 17.
Article in English | MEDLINE | ID: mdl-36808136

ABSTRACT

Cohort-wide sequencing studies have revealed that the largest category of variants is those deemed 'rare', even for the subset located in coding regions (99% of known coding variants are seen in less than 1% of the population). Associative methods give some understanding of how rare genetic variants influence disease and organism-level phenotypes. But here we show that additional discoveries can be made through a knowledge-based approach using protein domains and ontologies (function and phenotype) that considers all coding variants regardless of allele frequency. We describe an ab initio, genetics-first method making molecular knowledge-based interpretations for exome-wide non-synonymous variants for phenotypes at the organism and cellular level. By using this reverse approach, we identify plausible genetic causes for developmental disorders that have eluded other established methods and present molecular hypotheses for the causal genetics of 40 phenotypes generated from a direct-to-consumer genotype cohort. This system offers a chance to extract further discovery from genetic data after standard tools have been applied.


Subject(s)
Exome; Genetic Predisposition to Disease; Humans; Phenotype; Genotype; Gene Frequency
4.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subject(s)
Databases, Protein; Humans; Amino Acid Sequence; Artificial Intelligence; Internet; Proteins/chemistry; Software
5.
Comput Struct Biotechnol J ; 19: 3747-3754, 2021.
Article in English | MEDLINE | ID: mdl-34285776

ABSTRACT

Two major forces have contributed to the fast growth of human genetic data: one is medical research supported by governments and academic institutes; the other is direct-to-consumer (DTC) sequencing companies. While the former benefits from meticulously designed sequencing standards and quality control procedures, the latter comes in various formats and sequencing methods which are subject to changes over time and the particular needs of different companies. Thanks to the general public who shared their DNA data without constraint, here we provide a review for over 7000 genomes made public between 2011 and 2020, and produced by over six DTC sequencing companies. An open source tool-kit to systematically parse, quality check and filter genome files and statistically problematic alleles is provided to prepare consumer DNA datasets for research. The GenomePrep output is available in two common DNA datafile formats to enable further analysis with other tools. We also provide for download the combined output for all OpenSNP array genomes processed in this paper in a single data freeze file.

6.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subject(s)
Databases, Protein; Proteins/chemistry; Amino Acid Sequence; COVID-19/metabolism; Internet; Molecular Sequence Annotation; Protein Domains; Protein Interaction Maps; SARS-CoV-2/metabolism; Sequence Alignment
7.
Methods Mol Biol ; 2165: 27-67, 2020.
Article in English | MEDLINE | ID: mdl-32621218

ABSTRACT

The Genome3D consortium is a collaborative project involving protein structure prediction and annotation resources developed by six world-leading structural bioinformatics groups based in the United Kingdom (namely Blundell, Murzin, Gough, Sternberg, Orengo, and Jones). The main objective of Genome3D is to serve as a common portal providing both predicted models and annotations of proteins in model organisms, using several resources developed by these labs such as CATH-Gene3D, DOMSERF, pDomTHREADER, PHYRE, SUPERFAMILY, FUGUE/TOCATTA, and VIVACE. These resources primarily use SCOP- and/or CATH-based protein domain assignments. Another objective of Genome3D is to compare structural classifications of protein domains in the CATH and SCOP databases and to provide a consensus mapping of CATH and SCOP protein superfamilies. CATH/SCOP mapping analyses led to the identification of a total of 1429 consensus superfamilies. Currently, Genome3D provides structural annotations for ten model organisms, including Homo sapiens, Arabidopsis thaliana, Mus musculus, Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Plasmodium falciparum, Staphylococcus aureus, and Schizosaccharomyces pombe. Thus, Genome3D serves as a common gateway to each structure prediction/annotation resource and allows users to perform comparative assessment of the predictions. It thereby helps researchers broaden their perspective on structure/function predictions for their query protein of interest in selected model organisms.


Subject(s)
Genomics/organization & administration; Knowledge Bases; Molecular Sequence Annotation/methods; Proteome/chemistry; Animals; Arabidopsis; Genome; Genomics/methods; Humans; Information Dissemination; Sequence Alignment/methods; United Kingdom; Yeasts
8.
Nucleic Acids Res ; 48(D1): D314-D319, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31733063

ABSTRACT

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits including: providing instant validation of data and avoiding the requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements facilitate Genome3D being opened up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes have also been added of particular interest to the wider scientific community: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.


Subject(s)
Proteins/chemistry; Databases, Protein; Proteins/classification; Proteins/genetics; User-Computer Interface
9.
Nucleic Acids Res ; 48(D1): D376-D382, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31724711

ABSTRACT

The Structural Classification of Proteins (SCOP) database is a classification of protein domains organised according to their evolutionary and structural relationships. We report a major effort to increase the coverage of structural data, aiming to provide classification of almost all domain superfamilies with representatives in the PDB. We have also improved the database schema, provided a new API and modernised the web interface. This is by far the most significant update in coverage since SCOP 1.75 and builds on the advances in schema from the SCOP 2 prototype. The database is accessible from http://scop.mrc-lmb.cam.ac.uk.


Subject(s)
Databases, Protein; Protein Domains; Proteins/chemistry; Evolution, Molecular; Internet; Proteins/metabolism; Software; User-Computer Interface
10.
Genome Biol ; 20(1): 244, 2019 11 19.
Article in English | MEDLINE | ID: mdl-31744546

ABSTRACT

BACKGROUND: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. RESULTS: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. CONCLUSION: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.


Subject(s)
Molecular Sequence Annotation/trends; Animals; Biofilms; Candida albicans/genetics; Drosophila melanogaster/genetics; Genome, Bacterial; Genome, Fungal; Humans; Locomotion; Memory, Long-Term; Molecular Sequence Annotation/methods; Pseudomonas aeruginosa/genetics
11.
Methods Mol Biol ; 1975: C1, 2019.
Article in English | MEDLINE | ID: mdl-31290135

ABSTRACT

This chapter was published without including the "Conflict of Interest" section given by the author along with the corrected proof.

12.
Hum Mutat ; 40(9): 1373-1391, 2019 09.
Article in English | MEDLINE | ID: mdl-31322791

ABSTRACT

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.


Subject(s)
Computational Biology/methods; Genetic Variation; Undiagnosed Diseases/diagnosis; Adolescent; Child; Child, Preschool; Computer Simulation; Databases, Genetic; Female; Genetic Predisposition to Disease; Humans; Male; Phenotype; Undiagnosed Diseases/genetics; Whole Genome Sequencing
13.
Curr Biol ; 29(15): 2580-2585.e4, 2019 08 05.
Article in English | MEDLINE | ID: mdl-31353185

ABSTRACT

Although UVA radiation (315-400 nm) represents 95% of the UV radiation reaching the earth's surface, surprisingly little is known about its effects on plants [1]. We show that in Arabidopsis, short-term exposure to UVA inhibits the opening of stomata, and this requires a reduction in the cytosolic level of cGMP. This process is independent of UVR8, the UVB receptor. A cGMP-activated phosphodiesterase (AtCN-PDE1) was responsible for the UVA-induced decrease in cGMP in Arabidopsis. AtCN-PDE1-like proteins form a clade within the large HD-domain/PDEase-like protein superfamily, but no eukaryotic members of this subfamily have been functionally characterized. These genes have been lost from the genomes of metazoans but are otherwise conserved as single-copy genes across the tree of life. In longer-term experiments, UVA radiation increased growth and decreased water-use efficiency. These experiments revealed that PDE1 is also a negative regulator of growth. As the PDE1 gene is ancient and not represented in animal lineages, it is likely that at least one element of cGMP signaling in plants has evolved differently to the system present in metazoans.


Subject(s)
Arabidopsis Proteins/genetics; Arabidopsis/radiation effects; Cyclic GMP/metabolism; Cyclic Nucleotide Phosphodiesterases, Type 1/genetics; Ultraviolet Rays; Arabidopsis/genetics; Arabidopsis/metabolism; Arabidopsis Proteins/metabolism; Cyclic Nucleotide Phosphodiesterases, Type 1/metabolism; Signal Transduction
14.
Methods Mol Biol ; 1975: 333-361, 2019.
Article in English | MEDLINE | ID: mdl-31062318

ABSTRACT

The process of identifying sets of transcription factors that can induce a cell conversion can be time-consuming and expensive. To help alleviate this, a number of computational tools have been developed which integrate gene expression data with molecular interaction networks in order to predict these factors. One such approach is Mogrify, an algorithm which ranks transcription factors based on their regulatory influence in different cell types and tissues. These ranks are then used to identify a nonredundant set of transcription factors to promote cell conversion between any two cell types/tissues. Here we summarize the important concepts and data sources that were used in the implementation of this approach. Furthermore, we describe how the associated web resource (www.mogrify.net) can be used to tailor predictions to specific experimental scenarios, for instance, limiting the set of possible transcription factors and including domain knowledge. Finally, we describe important considerations for the effective selection of reprogramming factors. We envision that such data-driven approaches will become commonplace in the field, rapidly accelerating the progress in stem cell biology.


Subject(s)
Cell Differentiation; Cell Transdifferentiation; Cellular Reprogramming; Computational Biology/methods; Stem Cells/cytology; Stem Cells/metabolism; Transcription Factors/metabolism; Algorithms; Gene Expression Regulation; Humans; Protein Interaction Domains and Motifs
15.
Nucleic Acids Res ; 47(10): 4970-4973, 2019 06 04.
Article in English | MEDLINE | ID: mdl-30997511

ABSTRACT

The alignment between the boundaries of protein domains and the boundaries of exons could provide evidence for the evolution of proteins via domain shuffling, but literature in the field has so far struggled to conclusively show this. Here, on larger data sets than previously possible, we do finally show that this phenomenon is indisputably found widely across the eukaryotic tree. In contrast, the alignment between exons and the boundaries of intrinsically disordered regions of proteins is not a general property of eukaryotes. Most interesting of all is the discovery that domain-exon alignment is much more common in recently evolved protein sequences than older ones.


Subject(s)
Eukaryotic Cells/metabolism; Exons/genetics; Introns/genetics; Proteins/genetics; Animals; Evolution, Molecular; Genome/genetics; Humans
16.
Nucleic Acids Res ; 47(D1): D490-D494, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30445555

ABSTRACT

Here, we present a major update to the SUPERFAMILY database and the webserver. We describe the addition of a new SUPERFAMILY 2.0 profile HMM library containing a total of 27 623 HMMs. The database now includes Superfamily domain annotations for millions of protein sequences taken from the Universal Protein Resource Knowledgebase (UniProtKB) and the National Center for Biotechnology Information (NCBI). This addition constitutes about 51 and 45 million distinct protein sequences obtained from UniProtKB and NCBI respectively. Currently, the database contains annotations for 63 244 and 102 151 complete genomes taken from UniProtKB and NCBI respectively. The current sequence collection and genome update is the biggest so far in the history of SUPERFAMILY updates. In order to deal with the massive wealth of information, here we introduce a new SUPERFAMILY 2.0 webserver (http://supfam.org). Currently, the webserver mainly focuses on the search, retrieval and display of Superfamily annotation for the entire sequence and genome collection in the database.


Subject(s)
Databases, Protein; Protein Domains; Proteome/chemistry; Genome; Internet; Markov Chains; Protein Domains/genetics; Sequence Analysis, Protein
17.
Nucleic Acids Res ; 47(D1): D351-D360, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398656

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.


Subject(s)
Databases, Protein; Molecular Sequence Annotation; Animals; Databases, Genetic; Gene Ontology; Humans; Internet; Multigene Family; Protein Domains/genetics; Sequence Homology, Amino Acid; Software; User-Computer Interface
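
Note on the InterPro 70.0 entry above: the abstract refers to residue coverage without defining it. The sketch below assumes one plausible reading, namely the fraction of residues in a sequence that fall inside at least one annotated region. This definition, the function name, and the 1-based inclusive coordinates are all assumptions made for illustration, not details taken from the paper.

```python
def residue_coverage(length: int, regions: list[tuple[int, int]]) -> float:
    """Fraction of residues covered by at least one annotated region.

    `regions` holds (start, end) pairs in 1-based inclusive coordinates.
    Overlapping or nested regions are counted once; parts of a region
    outside 1..length are ignored.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    covered = 0
    last_end = 0  # highest residue position already counted
    for start, end in sorted(regions):
        start = max(start, last_end + 1, 1)  # skip residues counted earlier
        end = min(end, length)               # clip to the sequence length
        if start <= end:
            covered += end - start + 1
            last_end = end
    return covered / length


if __name__ == "__main__":
    # Hypothetical 100-residue protein with two overlapping annotations.
    print(residue_coverage(100, [(1, 50), (40, 80)]))
```

Sorting the regions first means a single pass is enough to merge overlaps, so a residue matched by several signatures is never counted twice.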
18.
Hum Mutat ; 38(9): 1266-1276, 2017 09.
Article in English | MEDLINE | ID: mdl-28544481

ABSTRACT

The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Whole Genome Sequencing/methods; Area Under Curve; Genetic Predisposition to Disease; Human Genome Project; Humans; Phenotype; Quantitative Trait Loci
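
Note on the CAGI PGP entry above: ROC AUC is listed among the measures used to assess submissions. As a minimal sketch of what that number means, the code below computes AUC through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the example data are hypothetical, and this is not the assessors' actual code.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve from binary labels (1/0) and scores.

    Uses the pairwise definition, which is O(n_pos * n_neg) and fine
    for small assessment sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical predictions for whether four individuals have a trait.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of individuals with and without the trait.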
19.
Hum Mutat ; 38(9): 1042-1050, 2017 09.
Article in English | MEDLINE | ID: mdl-28440912

ABSTRACT

Correct phenotypic interpretation of variants of unknown significance for cancer-associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test-set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.


Subject(s)
Computational Biology/methods; Cyclin-Dependent Kinase Inhibitor p18/genetics; Genetic Variation; Cell Line, Tumor; Cell Proliferation; Computer Simulation; Cyclin-Dependent Kinase Inhibitor p16; Cyclin-Dependent Kinase Inhibitor p18/chemistry; Databases, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Protein Stability
20.
Nature ; 543(7644): 199-204, 2017 03 09.
Article in English | MEDLINE | ID: mdl-28241135

ABSTRACT

Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.


Subject(s)
Databases, Genetic; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/genetics; Transcriptome/genetics; Cells, Cultured; Conserved Sequence/genetics; Datasets as Topic; Enhancer Elements, Genetic/genetics; Epigenesis, Genetic; Gene Expression Profiling; Gene Expression Regulation; Genome, Human/genetics; Genome-Wide Association Study; Genomics; Humans; Internet; Molecular Sequence Annotation; Organ Specificity/genetics; Polymorphism, Single Nucleotide; Promoter Regions, Genetic/genetics; Quantitative Trait Loci/genetics; RNA Stability; RNA, Messenger/genetics